library(readr)
library(momentuHMM)
library(ggplot2)
library(dplyr)
library(lubridate)
library(data.tree)
library(DiagrammeR)The goal of this tutorial is to explore how to fit hidden Markov models to accelerometer data, how to incorporate covariates into the transition probabilities, and the implementation of hierarchical Markov models to animal movement data. For the first two objectives, we will use 4 days of acceleration data obtained from a free-ranging blacktip reef shark at Palmyra Atoll in the central Pacific Ocean (data taken from Leos-Barajas et al. 2017).
Accelerometer devices measure up to three axes, which can be described relative to the body of the animal: longitudinal (surge), lateral (sway) and dorsoventral (heave). These devices are becoming more prevalent in the fields of animal biologging data as they provide a means of measuring activity in a meaningful and quantitative way. From tri-axial acceleration data, we can also derive several measures that summarize effort or exertion and relate acceleration to activity levels such as overall dynamic body acceleration (ODBA) and vectorial dynamic body acceleration (VeDBA). These metrics can be used to reduce the dimensionality of three-dimension acceleration data while retaining important information. Further, because acceleration data is often at high temporal resolutions over time, it also naturally exhibits a large degree of autocorrelation, making it impossible to assume independence between sequential observations. As we have learned, HMMs can account for the autocorrelation present in the data while assuming that the data were generated according to a finite set of (unobserved) behaviors making them a good candidate model for this type of data structure. Today, we will fit an HMM to the ODBA of a blacktip shark, calculated every second.
For the blacktip shark, we have time of day, water temperature, depth and ODBA. Since one of our goals is to use the time of observations (second of the day) as a covariate, we also need to extract this information from the time variable.
# Reading the data
BlacktipB <- read_delim("data/BlacktipB.txt",
delim = "\t", escape_double = FALSE,
trim_ws = TRUE)
# Let's transform the date/time info into a proper time format
# In this case we need to extract the second of the day correspondent to each observation
BlacktipB = BlacktipB %>%
mutate(Time = as.POSIXct(Time,format = "%m/%d/%Y %H:%M")) %>%
mutate(hour_to_sec = as.integer(seconds(hm(format(Time, format = "%H:%M"))))) %>%
group_by(Time) %>% mutate(sec = row_number()) %>% ungroup() %>%
mutate(sec = case_when(hour_to_sec == 53160 ~ as.integer(sec + 55),
TRUE ~ sec),
hour_to_sec = hour_to_sec + sec)
head(BlacktipB)## # A tibble: 6 × 6
## Time Temp Depth ODBA hour_to_sec sec
## <dttm> <dbl> <dbl> <dbl> <int> <int>
## 1 2013-07-10 14:46:00 30.1 0.3 0.0672 53216 56
## 2 2013-07-10 14:46:00 30.1 0.3 0.0298 53217 57
## 3 2013-07-10 14:46:00 30.1 0.3 0.0422 53218 58
## 4 2013-07-10 14:46:00 30.1 0.3 0.0748 53219 59
## 5 2013-07-10 14:46:00 30.1 0.3 0.0442 53220 60
## 6 2013-07-10 14:47:00 30.1 0.3 0.0995 53221 1
Looking at the ODBA values through the observed period, we find ODBA is unusually high at some times – for this shark we assumed that values between 0 and 2 were consistent with what we expected. Because accommodating extreme values can pose a problem for identification of an adequate state-dependent distribution in our HMM, we removed them from the data set. However, note that in general, deciding whether to remove extreme values or not will more likely depend on whether we find appropriate distributional forms that can accommodate them. Generally, we need to make sure that extreme values are in fact some artefact of the data collection process, not representative of a behavior of interest, or inconsistent with what we are trying to capture as well. Removing data is not good general practice but instead we can assess on a case-by-case basis.
We can see the original time series here across the four days:
BlacktipB %>% ggplot(aes(Time,ODBA)) + geom_line()BlacktipB = BlacktipB %>% filter(ODBA <= 2.0)And now, the modified time series with values above 2.0 removed. Now, we are ready to start to look for models for this data!
Given our data, we are interested in finding possible behaviours through the observation process (ODBA). Let’s take a quick look at the histogram of the observations.
As we have indicated before, the best way to start when fitting a hidden Markov model is to keep things simple. In this case, we will be considering 2 behavioral states, with gamma state-dependent distributions, and no covariates for the transition probability matrix. As mentioned in previous tutorials, now is time to implement the decisions that we have made so far. For the choice of initial parameter values we can take a quick peak at the data (e.g., using the plots above). From the plots above, we specify the means of our state-dependent distributions as 0.1 and 0.3.
Now that the data is ready for modeling, we choose to fit a 2-state hidden Markov model. For this purpose, we first need to assign the class momentuHMMData to the data in order to be presentable for the functions related to momentuHMM.
BlacktipBData = prepData(BlacktipB,coordNames = NULL,
covNames = "hour_to_sec")Let’s fit our model and take a look at the output of the fitted model.
fit1 = fitHMM(BlacktipBData,nbStates=2,dist=list(ODBA="gamma"),Par0 = list(ODBA=c(.1,.3,1,1)))
fit1## Value of the maximum log-likelihood: 632931
##
##
## ODBA parameters:
## ----------------
## state 1 state 2
## mean 0.12024871 0.05092870
## sd 0.07916712 0.02789225
##
## Regression coeffs for the transition probabilities:
## ---------------------------------------------------
## 1 -> 2 2 -> 1
## (Intercept) -2.520212 -3.78974
##
## Transition probability matrix:
## ------------------------------
## state 1 state 2
## state 1 0.92554670 0.0744533
## state 2 0.02210194 0.9778981
##
## Initial distribution:
## ---------------------
## state 1 state 2
## 0.4999013 0.5000987
We can also plot the results to obtain a visual representation of the fitted model.
plot(fit1,breaks = 80)## Decoding state sequence... DONE
Let’s include retryFits in the fitHMM function. This can take time to run.
set.seed(147)
fit1_s2 <- fitHMM(BlacktipBData,
nbState = 2,
dist=list(ODBA="gamma"),
Par0 = list(ODBA=c(.1,.3,1,1)),
retryFits=10)## Attempting to improve fit using random perturbation. Press 'esc' to force exit from 'fitHMM'
##
Attempt 1 of 10 -- current log-likelihood value: 632931
Attempt 2 of 10 -- current log-likelihood value: 632931
Attempt 3 of 10 -- current log-likelihood value: 632931
Attempt 4 of 10 -- current log-likelihood value: 632931
Attempt 5 of 10 -- current log-likelihood value: 632931
Attempt 6 of 10 -- current log-likelihood value: 632931
Attempt 7 of 10 -- current log-likelihood value: 632931
Attempt 8 of 10 -- current log-likelihood value: 632931.5
Attempt 9 of 10 -- current log-likelihood value: 632931.5
Attempt 10 of 10 -- current log-likelihood value: 632931.5
Seems nothing changed at all!
Now let’s go further and include a high perturbation in one of the initial values (instead of .1 and .3, let’s do .1 and 2). Do we still have similar estimated coefficients and log likelihood? (no matter the initial values, the coefficients should be similar)
fit1_s2_long <- fitHMM(BlacktipBData,
nbState = 2,
dist=list(ODBA="gamma"),
Par0 = list(ODBA=c(.1,1,1,1)))
fit1_s2_long## Value of the maximum log-likelihood: 632931
##
##
## ODBA parameters:
## ----------------
## state 1 state 2
## mean 0.12024868 0.05092870
## sd 0.07916712 0.02789225
##
## Regression coeffs for the transition probabilities:
## ---------------------------------------------------
## 1 -> 2 2 -> 1
## (Intercept) -2.520212 -3.789738
##
## Transition probability matrix:
## ------------------------------
## state 1 state 2
## state 1 0.92554667 0.07445333
## state 2 0.02210198 0.97789802
##
## Initial distribution:
## ---------------------
## state 1 state 2
## 0.4998976 0.5001024
Let’s look at the pseudo-residuals.
plotPR(fit1)## Computing pseudo-residuals...
We can see that there is high autocorrelation and that some deviation from normality.
We can also compute the most likely sequence of states. What can we infer from this? Is there something else we can say from this? According to fitted model, can we see if there is any interesting pattern?
# identify most likely state using the Viterbi algorithm
BlacktipB$state <- viterbi(fit1)
# proportion of the behaviour states during the observed period
table(BlacktipB$state)/length(BlacktipB$state)##
## 1 2
## 0.2088546 0.7911454
BlacktipB %>% mutate(day = day(Time)) %>% ggplot(aes(Time,state)) + facet_wrap(~day,scales = "free_x") + geom_point()As in Leos-Barajas et al. 2017, we can incorporate other information that may help explain the values of ODBA. In this case, we consider the second of the day of every observation. Time of day is represented by two trigonometric functions with period 24 h, \(cos(2\pi t/86,400)\) and \(sin(2\pi t/86,400)\) (86 400 is the number of seconds in a day). Using the function cosinor, we can convert our data stream to something that is useful for us. As well, we need to provide the formula corresponding to the regression that will be stored in the transition probability values.
# formula corresponding to the regression coefficients for the transition probabilities
formula = ~ cosinor(hour_to_sec, period = 86400)
Par0_fit2 <- getPar0(model=fit1, formula=formula)
fit2 = fitHMM(BlacktipBData,nbStates=2,dist=list(ODBA="gamma"),Par0 = Par0_fit2$Par,formula=formula)
fit2## Value of the maximum log-likelihood: 633197
##
##
## ODBA parameters:
## ----------------
## state 1 state 2
## mean 0.12067854 0.05095369
## sd 0.07935089 0.02790663
##
## Regression coeffs for the transition probabilities:
## ---------------------------------------------------
## 1 -> 2 2 -> 1
## (Intercept) -2.4297451 -3.80723087
## cosinorCos(hour_to_sec, period = 86400) -0.2046436 0.41186852
## cosinorSin(hour_to_sec, period = 86400) 0.1674542 -0.07324905
##
## Transition probability matrix (based on mean covariate values):
## ---------------------------------------------------------------
## state 1 state 2
## state 1 0.90262980 0.0973702
## state 2 0.01450939 0.9854906
##
## Initial distribution:
## ---------------------
## state 1 state 2
## 0.4996927 0.5003073
plot(fit2,breaks=80)## Decoding state sequence... DONE
Let’s explore the results. Do the coefficients vary much? What about the ACF? Did the autocorrelation decrease with this innovation?
plotPR(fit2)## Computing pseudo-residuals...
We can also take a quick look at the Akaike information criteria (AIC) for the two models to do a comparison.
AIC(fit1)## [1] -1265848
AIC(fit2)## [1] -1266372
Tiger shark data were collected in Oahu, Hawai’i. The shark’s depth was recorded in 0.5 m intervals every 2 s over a 23 day period, from March 9 to March 31, 2009. On some days, the tiger shark inhabited varied ranges of depth levels and performed many dives whereas other days it remained relatively constant in the water column and dove less. For sharks, movement in the water column is not easily segmented into types of dives. However, dives, or simply ascending or descending, can be an important part of a shark’s behavior. We extracted a depth position every ten minutes from the available data record and sequentially computed the absolute change in depth position, \(y^*_t = |d_t - d_{t-1}|\), for the tiger shark in order to understand how the depths inhabited by the shark differ across days. On some days the tiger shark tends to remain in the same part of the water column for long periods of time while on others, it tends to move up and down more frequently throughout the day. We processed the data to produce \(M=144\) observations per day, across \(K=21\) days, excluding the first and last days for which only partial data records were available.
# Read the data
tigerShark <- read_csv("data/tigershark_depthchange10min.csv")
tigerShark = tigerShark %>% mutate(abs_change_depth = abs(Depth - lag(Depth,default = 0)))
head(tigerShark)## # A tibble: 6 × 13
## Date Depth Light ETemp DateTime HST days minutes
## <dbl> <dbl> <dbl> <dbl> <dttm> <dttm> <dbl> <dbl>
## 1 39882. 4 197 21.6 2009-03-10 01:10:00 2009-03-09 15:10:00 9 910
## 2 39882. 4 200 21.4 2009-03-10 01:20:00 2009-03-09 15:20:00 9 920
## 3 39882. 10 188 21.4 2009-03-10 01:30:00 2009-03-09 15:30:00 9 930
## 4 39882. 17 174 21.3 2009-03-10 01:40:00 2009-03-09 15:40:00 9 940
## 5 39882. 14.5 180 21.4 2009-03-10 01:50:00 2009-03-09 15:50:00 9 950
## 6 39882. 13.5 181 21.2 2009-03-10 02:00:00 2009-03-09 16:00:00 9 960
## # … with 5 more variables: five <dbl>, ten <dbl>, second <dbl>, depthc <dbl>,
## # abs_change_depth <dbl>
tigerShark %>%
filter(days != 9) %>%
ggplot(aes(x=HST,y=Depth)) + facet_wrap(~days,scales = "free_x") + geom_line() +
scale_x_datetime(breaks= "8 hour", date_labels = "%H:%M") + theme_minimal()tigerShark %>% filter(days != 9) %>%
ggplot(aes(x=HST,y=2*abs_change_depth)) + facet_wrap(~days,scales = "free_x") + geom_line() +
scale_x_datetime(breaks= "8 hour", date_labels = "%H:%M") + theme_minimal()Generate new data streams from original datasets (overall mean in different time periods). Does the ACF changes between these data sets? How are they compared to the original data set? What can you say about this?
BlacktipB = BlacktipB %>%
mutate(group_10s = case_when(sec <= 10 ~ 1,
10 < sec & sec <= 20 ~ 2,
20 < sec & sec <= 30 ~ 3,
30 < sec & sec <= 40~ 4,
40 < sec & sec <= 50 ~ 5,
50 < sec & sec <= 60 ~ 6,
TRUE ~ -1),
group_30s = case_when(sec <= 30 ~ 1,
30 < sec & sec <= 60 ~ 2,
TRUE ~ -1)) %>%
group_by(Time,group_10s) %>% mutate(ODBA_mean_10s = mean(ODBA)) %>% ungroup() %>%
group_by(Time,group_30s) %>% mutate(ODBA_mean_30s = mean(ODBA)) %>% ungroup() %>%
group_by(Time) %>% mutate(ODBA_mean_60s = mean(ODBA)) %>%
mutate(hour_to_sec_10s = floor((hour_to_sec-1)/10+1),
hour_to_sec_30s = floor((hour_to_sec-1)/30+1),
hour_to_sec_60s = floor((hour_to_sec-1)/60+1))
BlacktipB_10s = BlacktipB %>% group_by(Time,group_10s) %>% filter(row_number() == 1) %>%
ungroup() %>% select(Time,Depth,ODBA_mean_10s,hour_to_sec_10s)
BlacktipB_30s = BlacktipB %>% group_by(Time,group_30s) %>% filter(row_number() == 1) %>%
ungroup() %>% select(Time,Depth,ODBA_mean_30s,hour_to_sec_30s)
BlacktipB_60s = BlacktipB %>% group_by(Time) %>% filter(row_number() == 1) %>%
ungroup() %>% select(Time,Depth,ODBA_mean_60s,hour_to_sec_60s)
BlacktipB = BlacktipB %>% select(Time,Depth,ODBA,hour_to_sec)Fit a 3-state HMM with no covariates and fit another one using time as a covariate. What happents to the ACF? According to loglikelihood and AIC, which model is better?